IVA 510

Cross-Validation and Bootstrapping

I. Ozkan

Spring 2025

Preliminary Readings

Learning Objectives

The Validation Set Approach

An Illustrative Example

ISLR Package: Auto Data
mpg  cylinders  displacement  horsepower  weight  acceleration  year  origin  name
 18          8           307         130    3504          12.0    70       1  chevrolet chevelle malibu
 15          8           350         165    3693          11.5    70       1  buick skylark 320
 18          8           318         150    3436          11.0    70       1  plymouth satellite
 16          8           304         150    3433          12.0    70       1  amc rebel sst
 17          8           302         140    3449          10.5    70       1  ford torino
 15          8           429         198    4341          10.0    70       1  ford galaxie 500

Example

Minimum MSE Values by Polynomial Degree, Across All Samples
Polynomial degree  Min. training MSE  Min. test MSE
                1             22.291         20.416
                2             17.493         16.767
                3             17.350         16.701
                4             17.273         16.674
                5             16.817         16.025
                6             16.674         15.710
                7             16.529         15.381
                8             16.529         15.420
                9             16.522         15.667
               10             16.517         16.088
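The validation set approach behind a table like this can be sketched in a few lines. This is a minimal illustration, not the code used for the slides: it uses synthetic data (an assumed quadratic relationship with Gaussian noise) rather than the Auto data, a single random 50/50 split, and a hypothetical helper `fit_poly` that solves the normal equations with the Python standard library only.

```python
import random

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X'X) b = X'y."""
    p = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(p)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    a = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, p):
            factor = a[r][col] / a[col][col]
            for c in range(col, p + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        s = a[i][p] - sum(a[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = s / a[i][i]
    return beta

def mse(beta, xs, ys):
    """Mean squared error of the fitted polynomial on (xs, ys)."""
    preds = [sum(b * x ** k for k, b in enumerate(beta)) for x in xs]
    return sum((y - pr) ** 2 for y, pr in zip(ys, preds)) / len(ys)

# Synthetic data (assumption): quadratic signal plus noise.
rng = random.Random(0)
n = 200
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [25.0 - 8.0 * xi + 10.0 * xi ** 2 + rng.gauss(0.0, 2.0) for xi in x]

# One random split into a training half and a validation (test) half.
idx = list(range(n))
rng.shuffle(idx)
train, test = idx[: n // 2], idx[n // 2:]
x_tr, y_tr = [x[i] for i in train], [y[i] for i in train]
x_te, y_te = [x[i] for i in test], [y[i] for i in test]

results = {}  # degree -> (training MSE, test MSE)
for degree in (1, 2, 3):
    beta = fit_poly(x_tr, y_tr, degree)
    results[degree] = (mse(beta, x_tr, y_tr), mse(beta, x_te, y_te))
    print(degree, round(results[degree][0], 3), round(results[degree][1], 3))
```

Repeating the split with different seeds and recording the training and test MSE for each degree would give the kind of per-sample summary reported in the table.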

Leave-One-Out Cross-Validation (LOOCV)

\(CV_{(n)} = \frac{1}{n}\sum^n_{i=1}MSE_i\)
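Written out, each \(MSE_i\) is the squared error on the single held-out observation. The symbol \(\hat y_i\) is introduced here for illustration and follows the usual ISLR convention: it is the prediction for observation \(i\) from the model fit on the remaining \(n-1\) observations.

\[
MSE_i = (y_i - \hat y_i)^2, \qquad CV_{(n)} = \frac{1}{n}\sum^n_{i=1}(y_i - \hat y_i)^2
\]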

LOOCV estimate for the linear model

\(mpg_i=\beta_0+\beta_1horsepower_i+\varepsilon_i\): 24.232
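A minimal sketch of LOOCV for a simple linear regression follows. It is not the code that produced the 24.232 figure: the data are synthetic (the names `hp` and `mpg` only mimic the Auto variables), and `fit_line` and `loocv` are hypothetical helpers written with the standard library.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def loocv(xs, ys):
    """Average of the n held-out squared errors MSE_i."""
    errors = []
    for i in range(len(xs)):
        # Fit on every observation except i, then predict observation i.
        x_tr = xs[:i] + xs[i + 1:]
        y_tr = ys[:i] + ys[i + 1:]
        b0, b1 = fit_line(x_tr, y_tr)
        errors.append((ys[i] - (b0 + b1 * xs[i])) ** 2)
    return sum(errors) / len(errors)

# Synthetic stand-in for horsepower and mpg (assumption, not the Auto data).
rng = random.Random(1)
hp = [rng.uniform(50.0, 200.0) for _ in range(100)]
mpg = [40.0 - 0.15 * h + rng.gauss(0.0, 4.0) for h in hp]

cv_n = loocv(hp, mpg)
print(round(cv_n, 3))
```

The loop refits the model n times, which is affordable for a single-predictor least-squares fit but grows expensive for larger models.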

k-Fold Cross-Validation

\(CV_{(k)} = \frac{1}{k}\sum^k_{j=1}MSE_j\)
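The formula averages the k fold-level errors \(MSE_j\). A minimal sketch, assuming a simple linear model, synthetic data, and folds formed by shuffling the indices and dealing them out in turn (the helper names are hypothetical):

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def k_fold_cv(xs, ys, k, seed=0):
    """Return (CV_(k), folds): the mean of the k fold MSEs and the fold indices."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]  # k roughly equal-sized folds
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        x_tr = [xs[i] for i in idx if i not in held]
        y_tr = [ys[i] for i in idx if i not in held]
        b0, b1 = fit_line(x_tr, y_tr)
        sq = [(ys[i] - (b0 + b1 * xs[i])) ** 2 for i in held_out]
        fold_mse.append(sum(sq) / len(sq))
    return sum(fold_mse) / k, folds

# Synthetic stand-in for horsepower and mpg (assumption, not the Auto data).
rng = random.Random(1)
hp = [rng.uniform(50.0, 200.0) for _ in range(100)]
mpg = [40.0 - 0.15 * h + rng.gauss(0.0, 4.0) for h in hp]

cv_k, folds = k_fold_cv(hp, mpg, k=5)
print(round(cv_k, 3))
```

Setting `k` equal to the number of observations makes every fold a single point, which reduces this to LOOCV.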

k-Fold

Resampling (Bootstrapping)

\(SE_B(\hat\alpha) = \sqrt{\frac{1}{B-1}\sum^B_{r=1}\bigg(\hat\alpha^{*r}-\frac{1}{B}\sum^B_{r'=1}\hat\alpha^{*r'}\bigg)^2}\)
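The standard-error formula is the sample standard deviation of the B bootstrap replicates. A minimal sketch, assuming the statistic is the sample mean of synthetic data so that the result can be compared with the textbook value \(s/\sqrt{n}\); `bootstrap_se` is a hypothetical helper, not the `boot` package:

```python
import math
import random
import statistics

def bootstrap_se(data, statistic, B=1000, seed=0):
    """Bootstrap standard error: SD (divisor B - 1) of B resampled statistics."""
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(B):
        # Draw n observations with replacement from the original sample.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(statistic(sample))
    mean_rep = sum(reps) / B
    return math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / (B - 1))

# Synthetic data (assumption): 200 draws from a normal distribution.
rng = random.Random(42)
data = [rng.gauss(10.0, 3.0) for _ in range(200)]

se_boot = bootstrap_se(data, statistics.mean, B=1000)
se_formula = statistics.stdev(data) / math.sqrt(len(data))
print(round(se_boot, 4), round(se_formula, 4))
```

For the mean the two numbers should be close; the value of the bootstrap is that the same function works for statistics with no simple standard-error formula.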

Bootstrap: Example (ISLR, page 187)

\(\hat\alpha = \frac{\hat\sigma^2_Y - \hat\sigma_{XY}}{\hat\sigma^2_X +\hat\sigma^2_Y-2\hat\sigma_{XY}}\)
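In the ISLR example, \(\alpha\) is the fraction invested in asset X (with \(1-\alpha\) in Y) that minimizes the variance of the combined return. The sketch below computes \(\hat\alpha\) from sample variances and the sample covariance and checks numerically that nearby allocations have a larger sample variance. The returns are synthetic, not the Portfolio data, and the function names are hypothetical.

```python
import random

def alpha_hat(x, y):
    """Plug-in estimate of alpha from sample variances and covariance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return (var_y - cov_xy) / (var_x + var_y - 2 * cov_xy)

def portfolio_variance(a, x, y):
    """Sample variance of the return a * X + (1 - a) * Y."""
    r = [a * xi + (1 - a) * yi for xi, yi in zip(x, y)]
    m = sum(r) / len(r)
    return sum((ri - m) ** 2 for ri in r) / (len(r) - 1)

# Synthetic correlated returns (assumption, not the Portfolio data).
rng = random.Random(7)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = [0.5 * xi + rng.gauss(0.0, 1.2) for xi in x]

a_hat = alpha_hat(x, y)
print(round(a_hat, 4))
print(round(portfolio_variance(a_hat, x, y), 4))
```

Passing `alpha_hat` as the statistic to a resampling loop gives bootstrap replicates of \(\hat\alpha\) like those listed on the slide.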

Bootstrap sample   Alpha
               1  0.4483
               2  0.5609
               3  0.5053
               4  0.6836
               5  0.6108
               6  0.5820
               7  0.5013
               8  0.5379
               9  0.6151
              10  0.5374


ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = Portfolio, statistic = statistic, R = 1000)


Bootstrap Statistics :
     original      bias    std. error
t1* 0.5758321 0.004719558  0.09020046

Bootstrap: Regression Example

\(ln(wage)=\beta_0 + \beta_1 \: experience+ \beta_2 \: experience^2 + \beta_3 \: education + \beta_4 \: ethnicity + \varepsilon\)


Call:
lm(formula = log(wage) ~ experience + I(experience^2) + education + 
    ethnicity, data = CPS1988)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9428 -0.3162  0.0580  0.3756  4.3830 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)      4.321e+00  1.917e-02  225.38   <2e-16 ***
experience       7.747e-02  8.800e-04   88.03   <2e-16 ***
I(experience^2) -1.316e-03  1.899e-05  -69.31   <2e-16 ***
education        8.567e-02  1.272e-03   67.34   <2e-16 ***
ethnicityafam   -2.434e-01  1.292e-02  -18.84   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5839 on 28150 degrees of freedom
Multiple R-squared:  0.3347,    Adjusted R-squared:  0.3346 
F-statistic:  3541 on 4 and 28150 DF,  p-value: < 2.2e-16

Regression Coefficient CIs
Variable             2.5%     97.5%
(Intercept)       4.28381   4.35898
experience        0.07575   0.07920
I(experience^2)  -0.00135  -0.00128
education         0.08318   0.08817
ethnicityafam    -0.26868  -0.21804

Bootstrapped Regression Coefficient CIs
Variable             2.5%     97.5%
(Intercept)       4.28110   4.36191
experience        0.07548   0.07948
I(experience^2)  -0.00136  -0.00127
education         0.08297   0.08836
ethnicityafam    -0.26951  -0.21739
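One common way to obtain bootstrapped coefficient intervals is to resample whole observations (cases) with replacement, refit the regression each time, and take the 2.5% and 97.5% percentiles of the refitted coefficients. The slides do not state which bootstrap variant was used, so the case-resampling percentile interval below is an assumption. The data are a synthetic single-predictor stand-in for the wage equation, not CPS1988, and the helper names are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def bootstrap_slope_ci(xs, ys, B=1000, seed=0):
    """95% percentile interval for the slope from case resampling."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(B):
        # Resample (x, y) pairs together so each row stays intact.
        rows = [rng.randrange(n) for _ in range(n)]
        _, b1 = fit_line([xs[i] for i in rows], [ys[i] for i in rows])
        slopes.append(b1)
    # With n=40 the first and last cut points are the 2.5% and 97.5% quantiles.
    cuts = statistics.quantiles(slopes, n=40)
    return cuts[0], cuts[-1]

# Synthetic data (assumption): log wage rising linearly with education.
rng = random.Random(3)
education = [rng.uniform(8.0, 18.0) for _ in range(200)]
log_wage = [4.3 + 0.085 * e + rng.gauss(0.0, 0.58) for e in education]

_, slope_hat = fit_line(education, log_wage)
ci_low, ci_high = bootstrap_slope_ci(education, log_wage)
print(round(slope_hat, 4), round(ci_low, 4), round(ci_high, 4))
```

With a large sample and well-behaved errors, intervals of this kind tend to sit close to the classical ones, which matches the near agreement between the two tables above.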